OcrV1, Main, Exploration, bibRecord, 000208

A survey on Arabic character segmentation

Internal identifier: 000208 (Main/Exploration); previous: 000207; next: 000209

Authors: Yasser M. Alginahi [Saudi Arabia]

Source:

International journal on document analysis and recognition (Print) [ISSN 1433-2833]; 2013.

RBID : Pascal:14-0004360

French descriptors

Pascal (Inist)
- Reconnaissance optique caractère, Reconnaissance caractère, Caractère manuscrit, Reconnaissance forme, Traitement image, Analyse donnée, Dictionnaire, En ligne, Arabe, Hors ligne, Formule imprimée, Idéogramme, Méthode indirecte, Segmentation image.
Wicri:
- topic: Dictionnaire.

English descriptors

KwdEn:
- Arabic, Character recognition, Data analysis, Dictionaries, Ideogram, Image processing, Image segmentation, Indirect method, Manuscript character, Off line, On line, Optical character recognition, Pattern recognition, Printed form.

Abstract

Arabic character segmentation is a necessary step in Arabic Optical Character Recognition (OCR). The cursive nature of Arabic script poses challenging problems in Arabic character recognition; however, incorrectly segmented characters will cause misclassifications of characters which in turn may lead to wrong results. Therefore, off-line Arabic character segmentation is a difficult research problem and little research has been achieved in this area in the past few decades. This is due to both the cursive nature of Arabic writing in both printed and handwritten forms and the scarcity of Arabic databases and dictionaries. Most of the character recognition methods used in the recognition of Arabic characters are adopted from available methods used on handwritten Latin and Chinese characters; however, other methods are developed only for Arabic character segmentation. This survey presents the description of the Arabic script characteristics with an overview on OCR systems and a comprehensive review mainly on off-line printed Arabic character segmentation techniques.

Affiliations:

Saudi Arabia

Links toward previous steps (curation, corpus...)

to stream PascalFrancis, to step Corpus: 000031
to stream PascalFrancis, to step Curation: 000733
to stream PascalFrancis, to step Checkpoint: 000051
to stream Main, to step Merge: 000211
to stream Main, to step Curation: 000208

The document in XML format

<record><TEI><teiHeader><fileDesc><titleStmt><title xml:lang="en" level="a">A survey on Arabic character segmentation</title>
<author><name sortKey="Alginahi, Yasser M" sort="Alginahi, Yasser M" uniqKey="Alginahi Y" first="Yasser M." last="Alginahi">Yasser M. Alginahi</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</s1>
<s3>SAU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Arabie saoudite</country>
<wicri:noRegion>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</wicri:noRegion>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">INIST</idno>
<idno type="inist">14-0004360</idno>
<date when="2013">2013</date>
<idno type="stanalyst">PASCAL 14-0004360 INIST</idno>
<idno type="RBID">Pascal:14-0004360</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000031</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000733</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000051</idno>
<idno type="wicri:doubleKey">1433-2833:2013:Alginahi Y:a:survey:on</idno>
<idno type="wicri:Area/Main/Merge">000211</idno>
<idno type="wicri:Area/Main/Curation">000208</idno>
<idno type="wicri:Area/Main/Exploration">000208</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title xml:lang="en" level="a">A survey on Arabic character segmentation</title>
<author><name sortKey="Alginahi, Yasser M" sort="Alginahi, Yasser M" uniqKey="Alginahi Y" first="Yasser M." last="Alginahi">Yasser M. Alginahi</name>
<affiliation wicri:level="1"><inist:fA14 i1="01"><s1>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</s1>
<s3>SAU</s3>
<sZ>1 aut.</sZ>
</inist:fA14>
<country>Arabie saoudite</country>
<wicri:noRegion>Department of Computer Science, College of Computer Science and Engineering, Taibah University, P.O. Box. 344, Al-Madinah Al-Munawarrah</wicri:noRegion>
</affiliation>
</author>
</analytic>
<series><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
<imprint><date when="2013">2013</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt><title level="j" type="main">International journal on document analysis and recognition : (Print)</title>
<title level="j" type="abbreviated">Int. j. doc. anal. recognit. : (Print)</title>
<idno type="ISSN">1433-2833</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Arabic</term>
<term>Character recognition</term>
<term>Data analysis</term>
<term>Dictionaries</term>
<term>Ideogram</term>
<term>Image processing</term>
<term>Image segmentation</term>
<term>Indirect method</term>
<term>Manuscript character</term>
<term>Off line</term>
<term>On line</term>
<term>Optical character recognition</term>
<term>Pattern recognition</term>
<term>Printed form</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr"><term>Reconnaissance optique caractère</term>
<term>Reconnaissance caractère</term>
<term>Caractère manuscrit</term>
<term>Reconnaissance forme</term>
<term>Traitement image</term>
<term>Analyse donnée</term>
<term>Dictionnaire</term>
<term>En ligne</term>
<term>Arabe</term>
<term>Hors ligne</term>
<term>Formule imprimée</term>
<term>Idéogramme</term>
<term>Méthode indirecte</term>
<term>Segmentation image</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr"><term>Dictionnaire</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Arabic character segmentation is a necessary step in Arabic Optical Character Recognition (OCR). The cursive nature of Arabic script poses challenging problems in Arabic character recognition; however, incorrectly segmented characters will cause misclassifications of characters which in turn may lead to wrong results. Therefore, off-line Arabic character segmentation is a difficult research problem and little research has been achieved in this area in the past few decades. This is due to both the cursive nature of Arabic writing in both printed and handwritten forms and the scarcity of Arabic databases and dictionaries. Most of the character recognition methods used in the recognition of Arabic characters are adopted from available methods used on handwritten Latin and Chinese characters; however, other methods are developed only for Arabic character segmentation. This survey presents the description of the Arabic script characteristics with an overview on OCR systems and a comprehensive review mainly on off-line printed Arabic character segmentation techniques.</div>
</front>
</TEI>
<affiliations><list><country><li>Arabie saoudite</li>
</country>
</list>
<tree><country name="Arabie saoudite"><noRegion><name sortKey="Alginahi, Yasser M" sort="Alginahi, Yasser M" uniqKey="Alginahi Y" first="Yasser M." last="Alginahi">Yasser M. Alginahi</name>
</noRegion>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000208 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000208 | SxmlIndent | more
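
The commands above only display the record. As a minimal sketch of how the same XML could be processed further, the Python below extracts the title, year, and English keywords. It assumes the input is a single <record> element shaped like the one shown earlier on this page. The record uses the wicri: and inist: prefixes without declaring them, so the sketch injects placeholder namespace URIs (an assumption made only to satisfy the parser, not the real Wicri or Inist namespaces). The names parse_record, summarize, and SAMPLE are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Placeholder URIs (assumption): the record never declares these prefixes,
# and a namespace-aware parser rejects undeclared prefixes.
PLACEHOLDER_NS = {
    "wicri": "urn:example:wicri",
    "inist": "urn:example:inist",
}

# Trimmed-down sample with the same shape as the record on this page.
SAMPLE = """<record><TEI><teiHeader><fileDesc><titleStmt>
<title xml:lang="en" level="a">A survey on Arabic character segmentation</title>
<author><affiliation wicri:level="1"><country>Arabie saoudite</country>
</affiliation></author>
</titleStmt>
<publicationStmt><date when="2013">2013</date></publicationStmt>
</fileDesc>
<profileDesc><textClass><keywords scheme="KwdEn" xml:lang="en"><term>Arabic</term>
<term>Image segmentation</term>
</keywords></textClass></profileDesc>
</teiHeader></TEI></record>"""


def parse_record(xml_text):
    """Parse a <record> whose wicri:/inist: prefixes are undeclared."""
    decls = "".join(
        f' xmlns:{prefix}="{uri}"' for prefix, uri in PLACEHOLDER_NS.items()
    )
    # Add the declarations to the first <record> start tag only.
    patched = re.sub(r"<record\b", "<record" + decls, xml_text, count=1)
    return ET.fromstring(patched)


def summarize(record):
    """Return the title, publication year, and English keywords."""
    date = record.find(".//publicationStmt/date")
    return {
        "title": record.findtext(".//titleStmt/title"),
        "year": date.get("when") if date is not None else None,
        "keywords_en": [
            term.text
            for term in record.findall(".//keywords[@scheme='KwdEn']/term")
        ],
    }


print(summarize(parse_record(SAMPLE)))
```

To apply this to real output, the text produced by the HfdSelect command could be read from standard input and passed to parse_record in place of SAMPLE, provided that output is a single well-formed <record> element.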

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:14-0004360
   |texte=   A survey on Arabic character segmentation
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024

	OCR exploration server
	Warning: this site is under development! Warning: this site was generated automatically from raw corpora. The information has therefore not been validated.